Early-Stage Apple Leaf Disease Prediction Using Deep Learning


Abstract 

Premature leaf fall, scab, Alternaria leaf spot, brown spot, mosaic, grey spot, and rust are common
apple leaf diseases. With the arrival of the monsoon, excess moisture in the air leads to outbreaks
of plant disease in hilly regions, and farmers in these regions are constantly worried about the
health of their apple plants. Scientists working in various departments, Krishi Vigyan Kendras, and
regional research stations have provided inputs to control these problems, but such inputs do not
help identify a disease at an early stage. Moreover, current disease diagnosis based on human
scouting is time-consuming and expensive. Our proposed system identifies various apple leaf diseases
at an early stage so that farmers and nearby research institutes can be alerted to take appropriate
action. The dataset contains 1821 images of apple leaves, comprising healthy leaves and leaves
infected with scab, rust, and other diseases. The proposed region-based convolutional neural
network approach is capable of localizing and classifying the disease with 90% accuracy.
Introduction


Agriculture plays an important role in the economy of every country. Farm produce must be in good
condition for farmers to achieve the expected profit, and advances in technology should be used in
this domain to protect produce and make it market-ready. However, because of continuously changing
weather and limited access to technology in this field, farmers face a major challenge in protecting
their produce from diseases that are unexpected and can occur at any time.


The traditional way of detecting disease is based on visual observation, which is time-consuming and
requires experts to be present in the field. Misdiagnosis of a disease may also harm the crop, the
product, and the consumers of that product.


Artificial Intelligence (AI) plays a significant role in many verticals, including agriculture. AI
can help solve the most common issues in agriculture [1], and it can be used to identify various
leaf diseases at an early stage. Automatic plant leaf disease detection methods help farmers reduce
their losses and improve productivity.


Various researchers have worked on this problem. Most have applied machine learning, image
processing techniques, and various deep learning algorithms to diagnose plant diseases [2-3].


We have used a deep learning technique [4], a subset of AI, to detect apple leaf disease at an
early stage. The proposed methodology generates masks around the affected area of the leaf and
detects disease in an efficient way. We use a confusion matrix as the performance measure for the
CNN model, and the bounding boxes drawn closely around the affected region of the leaf show that
the proposed model performs well.


Literature Review 


Agriculture is the primary occupation not only in India; many other countries also depend on it.
Researchers across the globe are trying to solve the problems faced by farmers with the help of the
latest technology.


Zhong and Zhao [5] proposed three methods, namely multi-label classification, a focus loss
function, and regression, all based on the DenseNet-121 architecture. A total of 2462 apple leaf
images covering six apple leaf diseases were used. The proposed approach achieved 93.5% accuracy,
which is better than traditional multi-class classification techniques.


To examine whether a deep CNN is really needed for feature extraction, Li et al. [6] compared two
methods, a shallow CNN with kernel SVM and a shallow CNN with random forest, against other
pre-trained deep learning models on three datasets: maize, grape, and apple. They found that these
two methods performed well in terms of precision, recall, and F1 score.


Bin Tahir et al. [7] used the PlantVillage dataset to re-train the Inception V3 model with transfer
learning and extracted the features required for classification. These features were down-sampled
using a novel variance-controlled approach that measures how each pixel varies from its nearest
neighbouring pixels, which reduces redundancy in the features. The proposed method achieved 97%
accuracy.


Many researchers are working on leaf disease detection with deep learning techniques on different
datasets. Hu et al. [8] proposed a deep CNN with multiscale feature extraction for tea leaf disease
detection. Experimental results show that the proposed method achieved 92.5% accuracy, and the
iteration time of the proposed model is lower than that of the VGG16 and AlexNet models.


Manual detection of leaf disease is a major challenge for farmers who are new to farming. Subetha
et al. [9] compared the performance of two architectures, ResNet50 and VGG19. The dataset, publicly
available on Kaggle, contains 3651 real-time images in four classes, namely scab, healthy, multiple
diseases, and rust. According to the results, both architectures achieved 87% accuracy.


Singh and Misra [10] proposed an image segmentation algorithm whose output can be used for
classification of multiple leaf diseases. They used a genetic algorithm to detect leaf disease at
an early stage.


Proposed Methodology 


The proposed method is based on Mask-RCNN (Mask Region-based Convolutional Neural Network) [4].
Mask-RCNN is an extension of Faster RCNN, an object detection algorithm, and additionally performs
image segmentation and masking. Segmentation is a pixel-level classification that determines which
pixels belong to which object. The Mask-RCNN architecture is divided into three parts: first, a
pretrained CNN model [11]; second, a Region Proposal Network (RPN); and last, the fully connected
layers and output. The backbone of this algorithm is ResNet50, which is used to extract image
features. The detailed pipeline of the proposed method is shown in Figure 1 below.
Figure 1. Proposed method based on Mask-RCNN architecture


Initially, we use the weights of a model pretrained on the Plant Pathology 2020 dataset. The last
layer of this pretrained architecture is a fully connected layer with a softmax activation
function. Feature maps from this model are used as input to the Region Proposal Network.


Region Proposal Network

A region proposal is an area where an object can possibly be found. The RPN uses a CNN with a
binary classifier to find regions of interest. The CNN layers of the regressor plot bounding boxes
[12] around possible objects, and the Intersection over Union (IoU) of these boxes is then used to
decide which boxes are likely to contain regions of interest.
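The IoU test described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: boxes are assumed to be `(x1, y1, x2, y2)` tuples, and the function names and the 0.5 threshold are hypothetical choices made for the example.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; zero if the box is degenerate."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = box_area((ix1, iy1, ix2, iy2))
    union = box_area(box_a) + box_area(box_b) - inter
    return inter / union if union > 0 else 0.0


def keep_candidates(proposals, reference_box, threshold=0.5):
    """Keep proposals whose IoU with the reference box reaches the threshold.

    The 0.5 default is an assumed value, not one stated in the paper.
    """
    return [p for p in proposals if iou(p, reference_box) >= threshold]
```

Identical boxes give an IoU of 1 and disjoint boxes give 0, so a higher threshold keeps only proposals that overlap the reference region closely.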

IoU is calculated by dividing the area of intersection of two boxes by the area of their union.
Once the regions of interest (ROIs) are finalized, the next step is ROI pooling. This step takes
the feature map from the CNN and the regions of interest from the regressor as input. ROI pooling
extracts fixed-size windows from the feature map, which helps to produce labels as the final
output: it uses max pooling to produce a fixed-size feature map of 7×7×512 from regions of
different sizes.
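To make the ROI pooling step concrete, the following sketch max-pools one region of a single-channel feature map into a 7×7 grid. It is an illustration under stated assumptions rather than the paper's code: the function name `roi_max_pool` is hypothetical, the ROI is given in feature-map cells with exclusive upper bounds and must lie inside the map, and a 512-channel map would simply repeat the same pooling per channel.

```python
def roi_max_pool(feature_map, roi, out_size=7):
    """Max-pool one region of a 2-D feature map into an out_size x out_size grid.

    feature_map: list of rows (one channel).
    roi: (x1, y1, x2, y2) in feature-map cells, x2 and y2 exclusive.
    """
    x1, y1, x2, y2 = roi
    height, width = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_size):
        # Row span of bin i; every bin covers at least one cell.
        r0 = y1 + (i * height) // out_size
        r1 = max(r0 + 1, y1 + ((i + 1) * height + out_size - 1) // out_size)
        row = []
        for j in range(out_size):
            # Column span of bin j, computed the same way.
            c0 = x1 + (j * width) // out_size
            c1 = max(c0 + 1, x1 + ((j + 1) * width + out_size - 1) // out_size)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled
```

Whatever the size of the input region, the output is always `out_size` by `out_size`, which is what allows regions of different sizes to feed the same fully connected layers.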


Experimental setup 
Dataset 

The Plant Pathology 2020 dataset is publicly available and is sponsored by the Cornell Initiative
for Digital Agriculture and FGVC7. It has 1821 leaf images in 4 classes: healthy (516 images),
multiple_diseases (91 images), rust (622 images), and scab (592 images). Figure 2 shows sample
images from the Plant Pathology 2020 dataset.
Figure 2. Samples from Plant Pathology 2020 dataset (panels: normal leaf, scab, rust, multiple diseases).


Image Annotation 

We used the VGG Image Annotator (VIA) to annotate each image manually [13]. With this tool, we
obtained the annotations as a .json file for each of the train and test datasets, and placed each
file in the train and test folder respectively.
Figure 3. VGG Image Annotator tool
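As an illustration of how such an annotation file might be read, the sketch below parses a small VIA-style JSON string and extracts the polygon points and label of each region. The sample content is invented for the example: the file name `leaf_001.jpg` and the `disease` attribute key are assumptions, and the layout follows the common VIA polygon export (`shape_attributes` with `all_points_x` and `all_points_y`), which may differ from the files used in this work.

```python
import json

# Hypothetical VIA-style export for a single image.
SAMPLE_VIA_JSON = """
{
  "leaf_001.jpg24680": {
    "filename": "leaf_001.jpg",
    "size": 24680,
    "regions": [
      {
        "shape_attributes": {
          "name": "polygon",
          "all_points_x": [30, 80, 75, 28],
          "all_points_y": [40, 42, 90, 88]
        },
        "region_attributes": {"disease": "scab"}
      }
    ],
    "file_attributes": {}
  }
}
"""


def load_via_polygons(via_json_text, label_key="disease"):
    """Return {filename: [(label, [(x, y), ...]), ...]} from VIA-style JSON."""
    annotations = json.loads(via_json_text)
    result = {}
    for entry in annotations.values():
        regions = entry.get("regions", [])
        # Older VIA exports store regions as a dict keyed by index.
        if isinstance(regions, dict):
            regions = list(regions.values())
        polygons = []
        for region in regions:
            shape = region["shape_attributes"]
            if shape.get("name") != "polygon":
                continue  # this sketch only handles polygon regions
            points = list(zip(shape["all_points_x"], shape["all_points_y"]))
            label = region.get("region_attributes", {}).get(label_key)
            polygons.append((label, points))
        result[entry["filename"]] = polygons
    return result


parsed = load_via_polygons(SAMPLE_VIA_JSON)
```

Each polygon can then be rasterised into a binary mask for the corresponding disease patch when preparing training data.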


Train the ResNet50 CNN for feature maps 
We trained a ResNet50 CNN on the Plant Pathology 2020 dataset in order to obtain feature maps of
the images. Images were resized to 224x224x3 and then divided into two parts: train and test. The
train set contains 80% of the images and the remainder form the test set. To fit the data properly,
we normalized the image pixel values. To reduce overfitting and improve the overall performance of
the model, image augmentation [14] was used, which helps the model generalize. During training we
used the Adam optimizer [15] with a learning rate of 0.001, the categorical cross-entropy loss
function, and a batch size of 10. After training, we saved the model weights in .h5 format so that
they can be used to train Mask-RCNN.
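The data preparation described here can be sketched with the standard library alone. This is an assumed reconstruction for illustration: the text does not say how the split was drawn or how pixels were normalized, so the shuffled split, the seed, the division by 255, and the file names below are all hypothetical. Only the 80% train fraction, the learning rate, and the batch size come from the text.

```python
import math
import random

# Values stated in the text.
TRAIN_FRACTION = 0.8
LEARNING_RATE = 0.001
BATCH_SIZE = 10


def train_test_split(items, train_fraction=TRAIN_FRACTION, seed=42):
    """Shuffle a copy of items and split it into train and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # the seed is an arbitrary choice
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def normalize_pixels(pixels):
    """Scale 8-bit pixel values from [0, 255] to [0.0, 1.0] (assumed scheme)."""
    return [value / 255.0 for value in pixels]


# Hypothetical file names standing in for the 1821 dataset images.
image_names = [f"image_{i}.jpg" for i in range(1821)]
train_names, test_names = train_test_split(image_names)

# Number of weight updates per epoch at the stated batch size.
steps_per_epoch = math.ceil(len(train_names) / BATCH_SIZE)
```

With 1821 images this gives 1456 training images and 365 test images, and 146 batches per epoch at a batch size of 10.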


Training of Mask-RCNN 


During the training phase of Mask-RCNN, the proposed methodology uses ResNet50 as the backbone, as
it takes less time than ResNet101 or ResNeXt101 because it has fewer layers. The detection minimum
confidence is set to 0.90, and all other parameters are kept at their defaults. The model is
trained on an NVIDIA GeForce RTX 2060 GPU with 16 GB of RAM.
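The effect of the detection minimum confidence can be illustrated with a small filter. This is a sketch, not the Mask-RCNN library's own code: the detection dictionaries, their field names, and the example scores are hypothetical, and only the 0.90 threshold comes from the text.

```python
DETECTION_MIN_CONFIDENCE = 0.90  # threshold reported in the text


def filter_detections(detections, min_confidence=DETECTION_MIN_CONFIDENCE):
    """Keep detections whose score is at least min_confidence."""
    return [d for d in detections if d["score"] >= min_confidence]


# Hypothetical raw detections for one image.
raw = [
    {"label": "scab", "score": 0.97, "box": (34, 50, 120, 140)},
    {"label": "rust", "score": 0.62, "box": (10, 12, 40, 44)},
    {"label": "rust", "score": 0.90, "box": (150, 80, 200, 131)},
]
kept = filter_detections(raw)
```

A high threshold such as 0.90 trades recall for precision: low-scoring patches are dropped, so fewer false alarms reach the farmer, at the cost of possibly missing faint early lesions.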


Performance measure report of ResNet50 

We evaluated the performance of our model using a confusion matrix, as shown in Figure 4, and
calculated accuracy, precision, recall, and F1 score. ResNet50 on the Plant Pathology 2020 dataset
gave 90% accurate predictions with 0.95 precision, 0.94 recall, and an F1 score of 0.96.
Figure 4. Confusion matrix of our model
Figure 5. Training Performance of our model
Figure 6. Learning curve of our model
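For reference, the sketch below shows how accuracy, precision, recall, and F1 score are conventionally derived from a confusion matrix. The matrix used is a made-up two-class example, not the paper's results, and the function name is hypothetical. Rows are assumed to be true classes and columns predicted classes.

```python
def classification_metrics(confusion):
    """Accuracy plus per-class precision, recall and F1.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    per_class = []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append({"precision": precision, "recall": recall, "f1": f1})
    accuracy = correct / total if total else 0.0
    return {"accuracy": accuracy, "per_class": per_class}


# Hypothetical 2-class confusion matrix, not the paper's results.
example = [[8, 2],
           [1, 9]]
metrics = classification_metrics(example)
```

For a four-class problem such as this dataset, the per-class values are typically averaged (for example, macro-averaged) to report a single precision, recall, and F1 figure.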


Results of Mask-RCNN 


As shown in Figure 7, the algorithm performs well and can predict the disease accurately. It
annotates the objects (disease patches) correctly and with confidence.

Figure 7. Predicted classes of disease with masks.


Conclusion

In this paper, we have implemented ResNet50 to obtain the weights that are used as inputs to the
proposed methodology. To validate the model, we have used performance measures such as the
confusion matrix, accuracy, precision, recall, and F1 score. The proposed model achieved 90%
confidence in detecting each disease class. It is able to mask the affected region of the leaf with
90% accurate labeling and can draw bounding boxes around the mask. The results indicate that our
proposed model performs with high accuracy and is able to detect apple leaf disease at an early
stage in an efficient way.


References 


[1] K. K. Singh, “An Artificial Intelligence and Cloud Based Collaborative Platform for
Plant Disease Identification, Tracking and Forecasting for Farmers,” Proc. - 7th 
IEEE Int. Conf. Cloud Comput. Emerg. Mark. CCEM 2018, pp. 49-56, 2019, doi: 
10.1109/CCEM.2018.00016. 


[2] B. Liu, Y. Zhang, D. J. He, and Y. Li, “Identification of apple leaf diseases based on
deep convolutional neural networks,” Symmetry (Basel), vol. 10, no. 1, 2018, doi:
10.3390/sym10010011.


[3] Y. Guo et al., “Plant Disease Identification Based on Deep Learning Algorithm in
Smart Farming,” Discret. Dyn. Nat. Soc., vol. 2020, 2020, doi:
10.1155/2020/2479172.


[4] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN.”


[5] Y. Zhong and M. Zhao, “Research on deep learning in apple leaf disease 
recognition,” Comput. Electron. Agric., vol. 168, no. October 2019, p. 105146, 2020, 
doi: 10.1016/j.compag.2019.105146. 


[6] Y. Li, J. Nie, and X. Chao, “Do we really need deep CNN for plant diseases 
identification?,” Comput. Electron. Agric., vol. 178, no. September, p. 105803, 2020, 
doi: 10.1016/j.compag.2020.105803. 


[7] M. Bin Tahir et al., “Recognition of Apple Leaf Diseases using Deep Learning and
Variances-Controlled Features Reduction,” Microprocess. Microsyst., p. 104027,
2021, doi: 10.1016/j.micpro.2021.104027.


[8] G. Hu, X. Yang, Y. Zhang, and M. Wan, “Identification of tea leaf diseases by using
an improved deep convolutional neural network,” Sustain. Comput. Informatics Syst., 
vol. 24, p. 100353, 2019, doi: 10.1016/j.suscom.2019.100353. 


[9] T. Subetha, R. Khilar, and M. Subaja Christo, “A comparative analysis on plant
pathology classification using deep learning architecture – Resnet and VGG19,”
Mater. Today Proc., no. xxxx, 2021, doi: 10.1016/j.matpr.2020.11.993.


[10] V. Singh and A. K. Misra, “Detection of plant leaf diseases using image segmentation
and soft computing techniques,” Inf. Process. Agric., vol. 4, no. 1, pp. 41-49, 2017,
doi: 10.1016/j.inpa.2016.10.005.


[11] Q. Zhang, M. Zhang, T. Chen, Z. Sun, Y. Ma, and B. Yu, “Recent advances in 
convolutional neural network acceleration,” Neurocomputing, vol. 323, pp. 37-51, 
2019, doi: 10.1016/j.neucom.2018.09.038. 


[12] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for
accurate object detection and semantic segmentation,” Proc. IEEE Comput. Soc.
Conf. Comput. Vis. Pattern Recognit., pp. 580-587, 2014, doi:
10.1109/CVPR.2014.81.


[13] “VGG Image Annotator (VIA).” Visual Geometry Group - University of Oxford, 
www.robots.ox.ac.uk/~vgg/software/via/. 


[14] “Data augmentation,” TensorFlow Core, n.d. Retrieved March 12, 2021, from
https://www.tensorflow.org/tutorials/images/data_augmentation


[15] D. P. Kingma and J. L. Ba, “Adam: A method for stochastic optimization,” 3rd Int.
Conf. Learn. Represent. ICLR 2015 - Conf. Track Proc., pp. 1-15, 2015. 


